Rank in Wordlist | Word | Rank in Wordlist | Word |
---|---|---|---|
1 | ny | 26 | va |
2 | as | 27 | da |
3 | y | 28 | ad |
4 | er | 29 | liorish |
5 | ayns | 30 | elley |
6 | Ta | 31 | oc |
7 | yn | 32 | ee |
8 | dy | 33 | currit |
9 | ta | 34 | stiagh |
10 | sy | 35 | vlein |
11 | eh | 36 | cur |
12 | jeh | 37 | nel |
13 | She | 38 | ennym |
14 | ec | 39 | ayn |
15 | myr | 40 | jeh'n |
16 | Ta'n | 41 | nyn |
17 | smoo | 42 | cha |
18 | shen | 43 | goaill |
19 | lesh | 44 | chooid |
20 | agh | 45 | da'n |
21 | magh | 46 | jannoo |
22 | son | 47 | Va |
23 | rish | 48 | valley |
24 | vel | 49 | syn |
25 | echey | 50 | row |
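The table shows the 50 most frequent words of the corpus. At these ranks we usually see stopwords, that is, function words such as articles, prepositions, and auxiliary verbs.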
Language: Afrikaans
This list is a good starting point for a first stopword list for the language.
Usually a small, balanced corpus is enough to obtain a good list of high-frequency words. If the small corpus is dominated by one prominent topic, however, words from that topic will show up even among the top-ranked words.
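The table lists capitalised and lower-case variants separately (for example "Ta" at rank 6 and "ta" at rank 9). A minimal sketch of turning such a ranked list into a first stopword list follows. It assumes that case-insensitive matching is wanted, which the source does not state, and the helper name `to_stopword_list` is hypothetical.

```python
# First ten rows of the table above, in rank order (truncated for brevity).
top_words = ["ny", "as", "y", "er", "ayns", "Ta", "yn", "dy", "ta", "sy"]


def to_stopword_list(words):
    """Case-fold the words and drop duplicates, keeping the rank order."""
    seen = set()
    stopwords = []
    for word in words:
        key = word.casefold()  # assumption: "Ta" and "ta" count as one stopword
        if key not in seen:
            seen.add(key)
            stopwords.append(key)
    return stopwords


stopwords = to_stopword_list(top_words)
```

A list produced this way should still be reviewed by hand, because topic words from an unbalanced corpus would pass through unchanged.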
select w_id-100 as rank_in_wordlist, word from words where w_id>100 order by w_id limit 50;
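The query above can be tried out against a toy table. The sketch below is an illustration under assumptions: it uses an in-memory SQLite database as a stand-in for the real corpus database, it assumes the `words` table has only the columns `w_id` and `word`, and it assumes that ids 1 to 100 are reserved for entries that are not regular words (which is what the offset of 100 in the query suggests). The placeholder rows are invented for the example.

```python
import sqlite3

# In-memory stand-in for the corpus database (assumption: SQLite is used here
# only for illustration; the original database system is not stated).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT)")

# Assumption: ids 1..100 hold reserved entries, so real words start at 101.
rows = [(i, f"<reserved-{i}>") for i in range(1, 101)]
# First three words from the table above, stored in frequency order.
rows += [(101, "ny"), (102, "as"), (103, "y")]
conn.executemany("INSERT INTO words (w_id, word) VALUES (?, ?)", rows)

# Same query as in the text, reformatted for readability.
query = """
SELECT w_id - 100 AS rank_in_wordlist, word
FROM words
WHERE w_id > 100
ORDER BY w_id
LIMIT 50
"""
result = conn.execute(query).fetchall()
conn.close()
```

Because the ids are assigned in frequency order, subtracting the offset turns the id directly into the rank shown in the first column of the table.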
3.4 Sample words for different frequency ranges